AITopics | fairlet decomposition

The fairness of clustering algorithms has gained widespread attention across various areas, including machine learning, In this paper, we study fair $k$-means clustering in Euclidean space. Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group within specified lower and upper bounds. Due to these fairness constraints, determining the optimal locations of $k$ centers is a quite challenging task. We propose a novel ``Relax and Merge'' framework that returns a $(1+4\rho + O(\epsilon))$-approximate solution, where $\rho$ is the approximate ratio of an off-the-shelf vanilla $k$-means algorithm and $O(\epsilon)$ can be an arbitrarily small positive number. If equipped with a PTAS of $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints, which improves the current state-of-the-art approximation guarantee. Furthermore, using our framework, we can also obtain a $(1+4\rho +O(\epsilon))$-approximate solution for the $k$-sparse Wasserstein Barycenter problem, which is a fundamental optimization problem in the field of optimal transport, and a $(2+6\rho)$-approximate solution for the strictly fair $k$-means clustering with no violation, both of which are better than the current state-of-the-art methods. In addition, the empirical results demonstrate that our proposed algorithm can significantly outperform baseline approaches in terms of clustering cost.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2411.01115

Country:

Asia > China > Anhui Province > Hefei (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

Fair Clustering Through Fairlets

Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii

Neural Information Processing SystemsOct-3-2024, 23:07:28 GMT

We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the k-center and the k-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions--for instance a point may no longer be assigned to its nearest cluster center! En route we introduce the concept of fairlets, which are minimal sets that satisfy fair representation while approximately preserving the clustering objective. We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms. While finding good fairlets can be NP-hard, we proceed to obtain efficient approximation algorithms based on minimum cost flow. We empirically demonstrate the price of fairness by quantifying the value of fair clustering on real-world datasets with sensitive attributes.

algorithm, decomposition, fairlet decomposition, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Italy > Lazio > Rome (0.04)

Industry: Law (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Fair Hierarchical Clustering

Ahmadian, Sara, Epasto, Alessandro, Knittel, Marina, Kumar, Ravi, Mahdian, Mohammad, Moseley, Benjamin, Pham, Philip, Vassilvitskii, Sergei, Wang, Yuyan

arXiv.org Machine LearningJun-18-2020

As machine learning has become more prevalent, researchers have begun to recognize the necessity of ensuring machine learning systems are fair. Recently, there has been an interest in defining a notion of fairness that mitigates over-representation in traditional clustering. In this paper we extend this notion to hierarchical clustering, where the goal is to recursively partition the data to optimize a specific objective. For various natural objectives, we obtain simple, efficient algorithms to find a provably good fair hierarchical clustering. Empirically, we show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2006.10221

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Fair Correlation Clustering

Ahmadian, Sara, Epasto, Alessandro, Kumar, Ravi, Mahdian, Mohammad

arXiv.org Artificial IntelligenceMar-2-2020

In this paper, we study correlation clustering under fairness constraints. Fair variants of $k$-median and $k$-center clustering have been studied recently, and approximation algorithms using a notion called fairlet decomposition have been proposed. We obtain approximation algorithms for fair correlation clustering under several important types of fairness constraints. Our results hinge on obtaining a fairlet decomposition for correlation clustering by introducing a novel combinatorial optimization problem. We define a fairlet decomposition with cost similar to the $k$-median cost and this allows us to obtain approximation algorithms for a wide range of fairness constraints. We complement our theoretical results with an in-depth analysis of our algorithms on real graphs where we show that fair solutions to correlation clustering can be obtained with limited increase in cost compared to the state-of-the-art (unfair) algorithms.

algorithm, correlation, fcost, (13 more...)

arXiv.org Artificial Intelligence

2002.02274

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Sicily > Palermo (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Add feedback

Fair Clustering Through Fairlets

Chierichetti, Flavio, Kumar, Ravi, Lattanzi, Silvio, Vassilvitskii, Sergei

arXiv.org Machine LearningFeb-15-2018

We study the question of fair clustering under the {\em disparate impact} doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the $k$-center and the $k$-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions---for instance a point may no longer be assigned to its nearest cluster center! En route we introduce the concept of fairlets, which are minimal sets that satisfy fair representation while approximately preserving the clustering objective. We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms. While finding good fairlets can be NP-hard, we proceed to obtain efficient approximation algorithms based on minimum cost flow. We empirically quantify the value of fair clustering on real-world datasets with sensitive attributes.

artificial intelligence, decomposition, machine learning, (16 more...)

arXiv.org Machine Learning

1802.05733

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Law (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback